Chapter 2: Logic and Languages

 

Notes

 

 

Computers are Smarter

In computery stuff, we have a hard drive and usually disks and printers. The data on the hard drive stays there even after we shut off power to it. We don't need to concern ourselves with how; we just need to be familiar with the concept. Likewise, a printed piece of paper that contains data is permanent because it requires nothing except to be left untampered. Short of wear and tear, the only way to lose data on a hard drive or printed paper is to actively try to harm them.

In our machine, this simple program would most likely be fixed to the hardware itself, and unchangeable. As machines gain complexity, their programs must be updated to extend their functionality or fix old functionality. Personal computers use these kinds of software programs.

Software programs are still instructions, but they exist on some kind of storage rather than on the machine's logical hardware itself. These are not already simply "known" by the machine; they must be loaded and run. In the world of computing you know, these are executable files on your personal computer.

When a software program is run, it is read by the machine using its brain, the CPU (microprocessor). The CPU follows the instructions of the program exactly. It stores what the program wants, retrieves what the program wants, outputs what the program wants, and feeds the program requested input. The CPU rarely denies requests and is therefore a powerful but dangerous thing.

So, there is another platform on which much of this logic can be run: the operating system. The operating system is a software program (or set of programs) that can run other programs written specifically for it. When these higher-level programs need access to the hardware of the machine, they ask the operating system rather than talking to the hardware directly through the CPU. This is safer, and because the operating system has a lot of logic already, higher-level programs can use that logic without writing more of their own.

 

What I will be teaching you to write are software programs to run on modern operating systems running on modern computers.  By modern, I mean within the last couple decades or so … I won’t be picky if you’re using this (link to old Tandy from Art Bell here!) computer.

 

Computers Don’t Speak Love

Everyone now and again hears or mentions the magical phrase “universal language”.  Let me tell you, computers don’t speak it.  In fact, if you thought there was a battle of the languages here in the real world, wait until you get into the cyber realm!  People already familiar with that aspect of debate know what I’m talking about.  What you’ll find is that many people will defend their favorite language to the death; worse than the crappy music they listen to!

 

But what am I talking about with languages?  At any of several levels in the computer’s logic there exists a language.  The language is the format of the logical instructions that make up a program.  In personal computers, the very base language is that of the CPU itself and is known as “machine language”.  In fact, your computer itself knows nothing other than this dismal language.  The problem with machine language is that it’s simplified beyond human comprehension.  Every instruction is broken down to very, very simple things so more of them can be read into the CPU at once.  The CPU doesn’t have to do any parsing (i.e. breaking the data into its logical and useful parts and then processing it) on them at all.  It would be the equivalent of instinct in animals.  They just know how to do some things, just like the CPU knows how to read these instructions brainlessly and endlessly.

 

In the dark ages of computing, humans were slaves to computers and were forced to write in machine language.  Then came the first of the translated languages: assembler.  It was hardly a step up from machine language, but I can actually put it in writing.  For example, to place the number 5 in one of the CPU’s registers (i.e. “brain cells”) you could write:

 

        Mov ax, 5

 

Since computers only really understand their machine language, programs written in these languages had to be translated.  No human would want that job, let me tell you, so the instructions were fed into a translator program whose instructions were already in machine language.  For assembler, the process of translating is known as “assembling”, and the program that does it is known simply as an “assembler”.
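To make "translating" a bit less abstract, here is a toy sketch of what an assembler does with that one line.  It is not a real assembler: the function name assemble_mov_ax is made up for this example, and it only understands the single form "mov ax, N" for a 16-bit x86 CPU.  The one hard fact in it is that 0xB8 is the x86 machine-language opcode for loading a number into ax, followed by the number with its low byte first.

```cpp
#include <cstdint>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Toy "assembler": translates the single instruction form "mov ax, N"
// into the bytes a 16-bit x86 CPU expects.  Everything else is rejected.
// assemble_mov_ax is a hypothetical name used only for this sketch.
std::vector<std::uint8_t> assemble_mov_ax(const std::string& line)
{
    std::istringstream in(line);
    std::string mnemonic, reg;
    unsigned value = 0;

    // Split the text into its parts: "mov", "ax," and the number.
    if (!(in >> mnemonic >> reg >> value))
        throw std::invalid_argument("expected: mov ax, <number>");
    if ((mnemonic != "mov" && mnemonic != "Mov") || reg != "ax,")
        throw std::invalid_argument("only 'mov ax, <number>' is supported");
    if (value > 0xFFFF)
        throw std::invalid_argument("number does not fit in 16 bits");

    // 0xB8 is the opcode for "mov ax, immediate"; the 16-bit value
    // follows with its low byte first.
    return { 0xB8,
             static_cast<std::uint8_t>(value & 0xFF),
             static_cast<std::uint8_t>(value >> 8) };
}
```

Feeding it "mov ax, 5" gives back the three bytes B8 05 00, which is all the CPU ever sees of that instruction.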

 

Assembler is known as a low-level language because it is exactly like machine language except mildly comprehensible.  Later on, more translated languages popped up but they were “higher-level” because they were even more readable and less tied to the computer’s machine language.  This sparked the idea of portability … you could take instructions from Machine A and put them on Machine B and compile them there into Machine B’s machine language.  These early high-level languages (such as “COBOL”) were good points in history but they had their shortcomings (they were the first, so this was inevitable).  For one thing, portability was really a myth.  You could take instructions from Machine A, then put them on Machine B, then pay someone to modify them for all the special “syntax” Machine B used, then have it compiled, then pay for repairs when it didn’t work, etc.  And their designers seemed to think that broken English was a great architecture for a “human readable” programming language.

 

I used some “new” terms there.  One was “syntax”.  It simply means the rules of a language (or its “features”, but those are covered by the rules).  The other was “programming language”.  A high-level language is known as a “programming language” for the simple reason that you write in this foreign language for the purpose of programming.

 

The higher-level translated languages that I have been speaking of, and that we will focus on, are called “compiled languages”.  C and C++ are compiled languages.  The term you figured was coming is “compiler”: the program used to translate compiled languages into machine language.
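For comparison with the assembler line from earlier, here is a rough C++ counterpart to placing the number 5 in a register.  The function name store_five is made up for this sketch.  Notice that you never name a register; the compiler decides where the value actually lives when it translates this into machine language.

```cpp
// A C++ counterpart to the assembler line "mov ax, 5".
// store_five is a hypothetical name used only for this sketch.
int store_five()
{
    int value = 5;   // the compiler picks the register or memory location
    return value;
}
```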

 

The fundamental thing you must realize here is that computers cannot read or understand C++ without it first being translated.  Another way a computer can read instructions from a high-level language is through “interpretation” rather than “compilation”.  The term “interpretation” is used for languages whose instructions are read by a translator at the time the program is run.

 

Viva La Existence

The instructions for any language exist in files.  For high-level languages these files are known as “source” files as they contain “source code”.  The terms “source” and “source code” are synonymous … I suppose someone just got tired of saying “code” and dropped it and no one really has noticed since.  Source is actually just a way of saying “instructions”.  When I say “Here is my C++ source”, what I mean is “Here are my instructions written in C++”.  For C++, these files usually have one of the following extensions: “c”, “cpp”, “cxx”, “h”, or “hpp”.  I’ll get into what each of them typically means later.

 

Most machine language exists in files as well.  These are typically referred to as “executable” files.  They don’t have a larger salary than other files, but the machine does understand them natively.  These files can be executed or “run” (not killed) natively by the machine.  On DOS/Windows machines, these files have the extension “exe”.

 

Your C++ source files are not executable … though you would like them to be.  You cannot simply run them because the computer doesn’t understand their language; it only knows the blasphemous machine language.  If you try to “run” a C++ source file on Microsoft Windows, by say “double-clicking” it (a popular choice for many people today – double clicking that is) … it will either open the file using a program (like Microsoft Visual C++, TextPad, etc.) or ask you what you want to do with it.  But it cannot run the file; it isn’t executable.  Remember you must compile (translate) your C++ source files, and therefore your C++ instructions, into machine code and an executable file.  More succinctly put, you must translate your source file into an executable file.

 

Notice I said source files (plural).  This is an understandability issue.  Usually programmers will keep their program’s source code in multiple source files.  This makes it easier to manage and change.  These files will all get compiled (remember that means translated) into a single file of machine code.  That is, of course, if they are for a single project.  You see, not all machine code is in a single file either.  When you keep those instructions in separate files you only have to change those files to change that set of instructions.  Did you think Windows was stored in a single executable somewhere on your system that you could doink with?  No, it’s stored in many, many files.
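As a small sketch of what "multiple source files" can look like, here is one set of instructions split the way C++ programmers often split them.  The file names greeting.h, greeting.cpp, and main.cpp and the function make_greeting are all made up for illustration, and the pieces are shown together in one listing so you can read them top to bottom.

```cpp
// ---- greeting.h (hypothetical header file) ----
// Only announces that the function exists.
#include <string>
std::string make_greeting(const std::string& name);

// ---- greeting.cpp (hypothetical source file) ----
// Holds the actual instructions.  Change the greeting here and
// no other file has to be touched.
std::string make_greeting(const std::string& name)
{
    return "Hello, " + name + "!";
}

// ---- main.cpp (hypothetical) ----
// Would include greeting.h and call make_greeting("World").
```

Calling make_greeting("World") gives back "Hello, World!".  When the project is compiled, the machine code for all of these files ends up together in one executable.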

 

Not all executable files can be run directly.  Some of them are sucked in when other files are run.  For example, DLLs (or SOs if you’re on a Unix platform) are executable files but you cannot run them directly.  They need additional information to run that you cannot give them; other programs must.

 

Other Things That Run … or Crawl

Now, if you’re fairly adept at operating a computer you might be wondering about those other programs that you can “run”.  Things like command scripts, batch files, etc.  These are instructions, no doubt about it, and they are not machine language because usually you can actually read (and edit) them with a text editor.  The instructions in these files are still translated just like C++ instructions are compiled and Assembler instructions are assembled.  The translation for these types of files is known as “interpretation”.  Yeah, it’s a bit silly but it’s true.  The interpreter (program that translates interpreted languages) has all the machine language at its disposal that the interpreted language will ever use.  It reads the instructions, interprets them, executes some machine language that equates to the instruction, and then moves on to the next one.  This all happens in “real time” and no executable file is ever generated.
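The read, interpret, execute, move on loop described above can be sketched in a few lines of C++.  This is only an illustration: the three-instruction script language ("set", "add", "mul") and the function name run_script are invented for this example and don't correspond to any real scripting language.

```cpp
#include <sstream>
#include <stdexcept>
#include <string>

// Interprets a made-up scripting language, one line at a time.
// Supported instructions: "set N", "add N", "mul N".
// Returns the value of the single accumulator when the script ends.
long run_script(const std::string& source)
{
    std::istringstream lines(source);
    std::string line;
    long accumulator = 0;

    while (std::getline(lines, line)) {      // read the next instruction
        std::istringstream words(line);
        std::string command;
        if (!(words >> command))
            continue;                        // skip blank lines

        long operand = 0;
        if (!(words >> operand))
            throw std::runtime_error("missing number in: " + line);

        // Each branch is machine code the interpreter already contains.
        if (command == "set")      accumulator = operand;
        else if (command == "add") accumulator += operand;
        else if (command == "mul") accumulator *= operand;
        else throw std::runtime_error("unknown instruction: " + line);
    }
    return accumulator;
}
```

Running the script "set 2", "add 3", "mul 4" (one instruction per line) returns 20.  Nothing was ever written out as an executable; the script only "ran" because the interpreter did.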

 

These languages are also sometimes called “scripting” languages.  They include the ever popular JavaScript, Visual Basic Script (VBScript), QBasic, and batch files.

 

The typical advantage of interpreted languages is that they’re easy to understand, quick to write (since the interpreter usually knows a lot so the programmer has to know less), and it is easier to write extremely “dynamic” (ever-changing) things.  The major disadvantage is speed, followed by rules and portability.  Since you are insulated from the machine itself by the interpreter, you can make fewer mistakes, but in being further away your program is slower.  High performance programs are not typically written in script (a short name for an interpreted language).  Many utilities, however, are, and for good reason.

 

Since this tutorial is meant to school you in C++, a compiled language, I will not spend any more time on interpreted languages.

 

Interpreted Compiled Instructions

Yes, there is such a thing as instructions that are compiled into an interpreted language.  One very popular example is Java.  The Java source is written in a human-readable format.  It is then compiled into byte-code, which is instructions for a “virtual machine” (fancy name for an interpreter).  The advantage of this is that the interpreter is much faster because the format it must understand is much simpler (almost machine-code like).  The disadvantages are still basically the same as an interpreted language; except now you have to worry about some other things.
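To give a feel for why byte-code is easier on the interpreter, here is a minimal sketch of a "virtual machine" in C++.  The byte-code format, the opcode names PUSH, ADD, and MUL, and the function run_bytecode are all invented for this example; this is not Java's real byte-code.  The point is only that there is no text to pick apart, just numbers to look at one after another.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Made-up byte-code for a made-up stack machine (not Java's real format).
enum Op : std::uint8_t { PUSH = 0, ADD = 1, MUL = 2 };

// Runs the byte-code and returns the value left on top of the stack.
int run_bytecode(const std::vector<std::uint8_t>& code)
{
    std::vector<int> stack;
    for (std::size_t pc = 0; pc < code.size(); ++pc) {
        switch (code[pc]) {
        case PUSH:                       // the next byte is the value to push
            if (pc + 1 >= code.size())
                throw std::runtime_error("PUSH needs a value");
            stack.push_back(code[++pc]);
            break;
        case ADD:
        case MUL: {                      // pop two values, push the result
            if (stack.size() < 2)
                throw std::runtime_error("stack underflow");
            int right = stack.back(); stack.pop_back();
            int left = stack.back();  stack.pop_back();
            stack.push_back(code[pc] == ADD ? left + right : left * right);
            break;
        }
        default:
            throw std::runtime_error("unknown opcode");
        }
    }
    if (stack.empty())
        throw std::runtime_error("nothing left on the stack");
    return stack.back();
}
```

The byte sequence PUSH 2, PUSH 3, ADD, PUSH 4, MUL computes (2 + 3) * 4 and returns 20.  A compiler for this pretend language would have produced those bytes ahead of time from something human-readable.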

 

Again, I will speak no more of these byte-code languages as C++ is not one of them.  If you’re interested in more information, see here: …